[FLINK-26570][statefun] Remote module configuration interpolation #309

FilKarnicki · 2022-03-22T11:45:55Z

What is the purpose of the change

The goal of this PR is to add the ability to define ${placeholders} in the module.yaml's spec object, e.g.:

kind: com.foo.bar/test
spec:
  something: ${REPLACE_ME}
  transport:
    password: ${REPLACE_ME_WITH_A_SECRET}
    array:
      - ${REPLACE_ME}
      - sthElse

These placeholders are then replaced using

System properties
Environment variables
flink-conf.yaml entries with prefix 'statefun.module.global-config.'
Command line args

where (4) override (3) which override (2) which override (1).

Main changes are

RemoteModule resolves placeholders by traversing through the spec json node
ComponentJsonObject has a new constructor that allows the use of resolved spec json node (note: rawObjectNode still contains unresolved ${placeholders}, but does not seem to be used anywhere)

Verifying this change

Added tests for resolving placeholders, and a test for a value that is not found in the combined key-value config map, where an IllegalArgumentException is thrown.
Updated Java, Go, and JS e2e tests to use placeholders

Dependencies (does it add or upgrade a dependency): no
The public API, i.e., is any changed class annotated with @public(Evolving): N/A
The serializers: no
The runtime per-record code paths (performance sensitive): no
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
The S3 file system connector: no

tillrohrmann

Thanks for creating this PR @FilKarnicki. I like the flexibility it gives to the user. The main thing I am asking myself is whether it gives users too much power so that they risk shooting themselves into the foot. I don't have a perfect answer for it yet.

tillrohrmann · 2022-03-28T09:15:25Z

docs/content/docs/modules/overview.md

+1. System properties
+2. Environment variables
+3. flink-conf.yaml entries with prefix 'statefun.module.global-config.' 
+4. Command line args


I think it is more common to give env variables precedence over flink-conf.yaml values.

I agree in principal. That said, because globalConfiguration is already an combination of args and flink-conf.yaml entries with the statefun.module.global-config. prefix, there's no easy way to put env variables in between them without affecting other parts of the system.

I made a start in my fork and the number of changes is pretty high for what we gain. Please have a look and let me know if I Should working on the change you mentioned in this comment

FilKarnicki@02cd6a9

tillrohrmann · 2022-03-28T09:36:15Z

docs/content/docs/modules/overview.md

+kind: io.statefun.endpoints.v2/http
+spec:
+  functions: com.example/*
+  urlPathTemplate: ${FUNC_PROTOCOL}://${FUNC_DNS}/{function.name}
+---
+kind: io.statefun.kafka.v1/ingress
+spec:
+  id: com.example/my-ingress
+  address: ${KAFKA_ADDRESS}:${KAFKA_PORT}
+  consumerGroupId: my-consumer-group
+  topics:
+    - topic: ${KAFKA_INGRESS_TOPIC}
+      (...)
+  properties:
+    - ssl.truststore.location: ${SSL_TRUSTSTORE_LOCATION}
+    - ssl.truststore.password: ${SSL_TRUSTSTORE_PASSWORD}
+    (...)


I really like the flexibility the templating mechanism gives to the user. The thing I am asking myself is whether it gives too much power so that the user can shoot himself into his foot.

The danger I see is that we add level of indirections that make it harder to reason about what the effective yaml will look like for a user. The underlying problem the issue wants to solve is to pass in information that is only available to the process that runs SF but not the user (e.g. secrets). I am wondering whether there is an alternative to achieve the same but with a bit less power (e.g. allowing substitution only for selected fields (also confusing)). I don't have a perfect answer here. I mainly wanted to hear your and @igalshilman's opinion here.

Maybe one idea could be to log the effective/resolved yaml somewhere so that the user sees what is actually run, if this does not happen already.

I like the idea of logging the effective yaml somewhere. We should probably make it an opt-in kind of a deal, since we don't want to be automatically logging secrets. I'll hold off on coding this until we hear from @igalshilman

FilKarnicki · 2022-05-26T11:52:10Z

I'm beginning to think that we can achieve the same with a 'jar -uf' at deploy time. This means that this PR can be closed, do you agree @igalshilman / @tillrohrmann ?

nicklester · 2024-03-20T19:37:03Z

I still see value in this PR. When deploying into a managed environment such as Kinesis Data Analytics on AWS it is not possible to use something like 'jar -uf' to modify the jar at deploy time. We see value in being able to pass a hostname for API Gateway as an environment variable, for instance, to simplify multi environment deployment in an AWS CDK stack.

[FLINK-26570][statefun] Remote module configuration interpolation

4f63228

FilKarnicki force-pushed the FLINK-26570_remote_module_configuration_interpolation_final branch from 3d3ff6f to 4f63228 Compare March 22, 2022 11:48

tillrohrmann reviewed Mar 28, 2022

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FLINK-26570][statefun] Remote module configuration interpolation #309

[FLINK-26570][statefun] Remote module configuration interpolation #309

FilKarnicki commented Mar 22, 2022

tillrohrmann left a comment

tillrohrmann Mar 28, 2022

FilKarnicki Apr 4, 2022

tillrohrmann Mar 28, 2022

FilKarnicki Apr 4, 2022

FilKarnicki commented May 26, 2022

nicklester commented Mar 20, 2024

[FLINK-26570][statefun] Remote module configuration interpolation #309

Are you sure you want to change the base?

[FLINK-26570][statefun] Remote module configuration interpolation #309

Conversation

FilKarnicki commented Mar 22, 2022

What is the purpose of the change

Main changes are

Verifying this change

tillrohrmann left a comment

Choose a reason for hiding this comment

tillrohrmann Mar 28, 2022

Choose a reason for hiding this comment

FilKarnicki Apr 4, 2022

Choose a reason for hiding this comment

tillrohrmann Mar 28, 2022

Choose a reason for hiding this comment

FilKarnicki Apr 4, 2022

Choose a reason for hiding this comment

FilKarnicki commented May 26, 2022

nicklester commented Mar 20, 2024